Spatial singing voice synthesis (Spatial SVS) aims to produce expressive, high-quality singing voices with accurate spatial cues, enhancing listener immersion.

Please wear headphones to listen.
Demo
Text: 才二十三现在的日子没有那么简单 (English: "Only twenty-three, and life right now is not that simple")
Phoneme Sequence: <AP>, c, ai, ai, ai, er, sh, i, s, an, <AP>, x, ian, z, ai, ai, ai, d, e, r, i, z, i, <AP>, m, m, ei, iou, n, a, m, e, j, j, ian, d, an, <SP> (<SP> denotes a silence segment and <AP> a breath sound)
Spatial Prompt: [STATIC] Source locates at left-front up quadrant, and pauses in left-front up quadrant.
GT
Mono + SP
RMSSinger + SP
ISDrama (sing)
Text: 成都带不走的只有你 (English: "Chengdu, the only thing I cannot take away is you")
Phoneme Sequence: <SP>, ch, eng, d, u, <SP>, d, ai, b, u, z, ou, d, e, <SP>, zh, i, i, iou, iou, n, i, i, i, <SP> (<SP> denotes a silence segment and <AP> a breath sound)
Spatial Prompt: [STATIC] Source locates at right-front up quadrant, and pauses in right-front up quadrant.
GT
Mono + SP
RMSSinger + SP
ISDrama (sing)
Text: 你可以不用记得我的好 (English: "You do not have to remember how good I was to you")
Phoneme Sequence: <AP>, n, i, k, e, e, i, b, u, u, iong, j, i, i, d, e, uo, uo, d, e, h, ao (<SP> denotes a silence segment and <AP> a breath sound)
Spatial Prompt: [STATIC] Source locates at right up quadrant, and pauses in right up quadrant.
GT
Mono + SP
RMSSinger + SP
ISDrama (sing)
Text: 穿过时间的缝隙它依然真实地 (English: "Passing through the cracks of time, it still truly")
Phoneme Sequence: ch, uan, g, uo, sh, i, j, ian, ian, d, e, f, eng, x, i, <AP>, t, a, a, i, i, r, an, zh, en, sh, i, d, e (<SP> denotes a silence segment and <AP> a breath sound)
Spatial Prompt: [STATIC] Source locates at front up quadrant, and pauses in front up quadrant.
GT
Mono + SP
RMSSinger + SP
ISDrama (sing)